Stable choice coding in rat frontal orienting fields across model-predicted changes of mind

During decision making in a changing environment, evidence that may guide the decision accumulates until the point of action. In the rat, provisional choice is thought to be represented in frontal orienting fields (FOF), but this has only been tested in static environments where provisional and final decisions are not easily dissociated. Here, we characterize the representation of accumulated evidence in the FOF of rats performing a recently developed dynamic evidence accumulation task, which induces changes in the provisional decision, referred to as “changes of mind”. We find that FOF encodes evidence throughout decision formation with a temporal gain modulation that rises until the period when the animal may need to act. Furthermore, reversals in FOF firing rates can be accounted for by changes of mind predicted using a model of the decision process fit only to behavioral data. Our results suggest that the FOF represents provisional decisions even in dynamic, uncertain environments, allowing for rapid motor execution when it is time to act.


[Figure panels: Stationary, Dynamic]
Supplementary Figure 1: Accumulation model parameters. Best-fit model parameters and 95% confidence intervals for each rat in this study. In addition, the model parameter fits reported in [1] for 19 rats in a stationary environment are included for comparison. Brunton et al. [1] also included sticky (absorbing) evidence boundaries as an additional parameter. Consistent with [2], we did not include this absorbing parameter because it was previously found not to improve model fits [1] and because the large discounting rate λ prevents the boundaries from influencing the model. The difference in lapse parameters may be explained by the large evidence discounting rates λ, the lack of absorbing bounds in the model, or simply a difference in rat behavior. Source data are provided as a Source Data file.

Supplementary Methods

Accumulation model posterior distribution
To compute the accumulation model posterior distribution used for neural data analysis (referred to as the "backward pass distribution" in [1]), we first found the parameter set θ that best explained each rat's choices y under the forward model f(a) by maximizing P(y|θ), per equation 16 of the main text. We then used the best-fit parameter set to evaluate the forward model on each trial, producing a probability distribution over the accumulated evidence value at each time point given the initial accumulation value $a_0 \sim \mathcal{N}(0, \sigma_i^2)$. The posterior distribution p(a) combines this constraint on the initial conditions with a constraint on the final conditions given by the rat's choice, imposed through a backward distribution b(a). The backward model makes no assumptions about the initial conditions; it only constrains the final conditions to be consistent with the rat's choice at the end of the trial, $t_N$.
Importantly, the forward and backward distributions are conditionally independent, conditioned on the final value of the accumulated evidence. Given that they are independent, the posterior distribution that combines both observations is the product of the forward and backward distributions.
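Written out at a single time point t, the product rule above only needs a normalizing constant to become a proper density. The explicit time argument and the integration variable a' are notation introduced here for illustration and are not taken from the original equations.

```latex
p(a, t) \;=\; \frac{f(a, t)\, b(a, t)}{\displaystyle\int f(a', t)\, b(a', t)\, \mathrm{d}a'}
```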
It is important to note here that the adaptation process C is deterministic, and its evolution does not depend on the stochastic per-click noise realizations. We can consider this as an upstream sensory adaptation that happens before the integration process. As a result, the adapted clicks can be included in the forward and backward distributions.
One technical wrinkle is that our analytical solution for the model relies on Gaussian initial conditions. This presents no problem for the forward model, which assumes a Gaussian initial distribution $a_0 \sim \mathcal{N}(0, \sigma_i^2)$. However, when we compute the backward model, we begin at the final time point, where the final accumulation value is constrained to lie on the side of the decision boundary B corresponding to the animal's choice (for example, $a(t_N) < B$ for a left choice). From this final time point, the model evolves backwards in time in response to the adapted clicks. This constraint on the final accumulation value acts as the initial condition of the backward model, and because it is not Gaussian, our solution cannot be applied to it directly. To solve this, we discretized the a-value axis into small bins of width ∆a and solved the backward model for each bin, assuming a delta function of initial probability mass in that bin. We refer to the backward distribution from bin j as the delta-backward solution $b_j(a)$. The entire backward distribution is the mixture distribution over all the individual delta-backward solutions.
To compute each delta-backward solution $b_j(a)$, we use the time-reversed solution to the forward model. The mixture weights $w_j$ are all equal if the bin spacing is uniform. It might be tempting to weight each delta-backward solution by the forward model's probability mass in its bin; however, this is not correct. Because the backward model is independent of the forward model, the complete backward distribution should reflect all possible states consistent with the choice observation, which is the uniform distribution over the correct sign of a. With a set of $b_j(a)$ solutions, we can now combine them into the posterior distribution p(a). The exact solution as ∆a → 0 is given by:
$$p(a) \;\propto\; \lim_{\Delta a \to 0} \sum_{j} f(a)\, b_j(a),$$
where j indexes the bins on the side of B consistent with the choice. In practice we truncate the infinite series at a suitable extreme value of a and use a finite bin spacing ∆a = 1. On each trial, the extent of the grid of delta solutions was determined by finding the accumulation value beyond which less than $10^{-4}$ of the model posterior's probability mass lay at the end of the trial. Because f(a) and $b_j(a)$ are both Gaussian, their product $p_j(a) = f(a)\, b_j(a)$ is also Gaussian (up to a scale factor). This lets us write the posterior distribution as a sum of many delta-posterior modes.
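The combination of one forward Gaussian with many delta-backward Gaussians can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes that, at a given time point, the forward distribution is a Gaussian with mean `mu_f` and variance `var_f`, and that each delta-backward solution is a normalized Gaussian in a with mean `mu_b[j]` and variance `var_b`, all equally weighted. The function names and the example numbers are hypothetical.

```python
import numpy as np


def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)


def posterior_mixture(mu_f, var_f, mu_b, var_b):
    """Multiply one forward Gaussian by each delta-backward Gaussian.

    Assumption (illustrative): every b_j is a normalized Gaussian in a and
    all b_j carry equal mixture weight.
    Returns the weights, means, and variances of the delta-posterior modes.
    """
    mu_b = np.asarray(mu_b, dtype=float)
    var_b = np.broadcast_to(np.asarray(var_b, dtype=float), mu_b.shape)
    # Product of two Gaussians: precisions add, means are precision-weighted.
    var_p = 1.0 / (1.0 / var_f + 1.0 / var_b)
    mu_p = var_p * (mu_f / var_f + mu_b / var_b)
    # Each product integrates to N(mu_f; mu_b, var_f + var_b); that mass
    # becomes the (unnormalized) weight of the corresponding mode.
    mass = gaussian_pdf(mu_f, mu_b, var_f + var_b)
    weights = mass / mass.sum()
    return weights, mu_p, var_p


def posterior_mean(weights, mu_p):
    """Mean of the mixture of delta-posterior modes."""
    return float(np.sum(weights * mu_p))


if __name__ == "__main__":
    # Hypothetical example: 51 delta-backward modes centered on -50..0.
    centers = np.arange(-50.0, 1.0)
    w, m, v = posterior_mixture(mu_f=2.0, var_f=9.0, mu_b=centers, var_b=4.0)
    print("posterior mean:", posterior_mean(w, m))
```

The weights matter: although the delta-backward solutions enter with equal weight, multiplying by the forward distribution rescales each mode by how much it overlaps with the forward Gaussian.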

Toy model demonstration
To illustrate this solution to the model posterior, consider a simple random walk process. On each time step there is a 1/3 probability of staying in place, 1/3 probability of taking a step of size +1, and a 1/3 probability of taking a step of size −1. We start at a(t = 0) = 0, and then observe the process ten time steps later, t = 10. We now want to compute the forward model, and the posterior distribution for this data.
Using a binomial distribution we can analytically compute the forward model, which describes the probability of observing the process given the initial conditions and an elapsed duration (Fig. 7A). To compute a backward model, we must define the final value. For example, if we let the final value of the random walk be a(t = 10) = 2, then we can again use the binomial distribution to compute the distribution of possible values at earlier time steps (Fig. 7B). We can alternatively let the final value be the sign of the random walk process and compute the full backward distribution. Combining the independent forward and backward distributions, we can predict the posterior distribution, which is all possible states the random walk could be in given these two observations (Fig. 7E). We can check this against a particle simulation by sampling 2,000,000 trajectories from the random walk process to get the forward distribution (Fig. 7C). To get the posterior distribution we filter our samples for trajectories that ended up with positive value (Fig. 7F). We can compare these two distributions by taking a slice in time (Fig. 7D). We can now move on to the accumulation model, which has the same basic random walk structure, but with a few more bells and whistles.
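The toy demonstration can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the code used for Fig. 7: the forward and backward distributions are computed by repeated convolution with the step distribution instead of a closed-form expression, the final observation is taken to be a positive sign, and the default number of sampled trajectories is reduced to 200,000 for speed. All function names are hypothetical.

```python
import numpy as np

STEP_PMF = np.array([1 / 3, 1 / 3, 1 / 3])  # P(step = -1), P(step = 0), P(step = +1)
T = 10                 # number of time steps
K = T                  # states -K..K cover everything reachable from a_0 = 0
N_STATES = 2 * K + 1   # array index i corresponds to the value a = i - K


def forward_pass():
    """f[t, i] = P(a_t = i - K | a_0 = 0)."""
    f = np.zeros((T + 1, N_STATES))
    f[0, K] = 1.0
    for t in range(T):
        f[t + 1] = np.convolve(f[t], STEP_PMF, mode="same")
    return f


def backward_pass():
    """b[t, i] is proportional to P(a_T > 0 | a_t = i - K).

    Starts from equal mass on every positive final state and steps backwards.
    The step distribution is symmetric, so the same convolution applies.
    Values near the grid edge are truncated, but those states have zero
    forward probability, so the posterior is unaffected.
    """
    b = np.zeros((T + 1, N_STATES))
    b[T, K + 1:] = 1.0
    for t in range(T, 0, -1):
        b[t - 1] = np.convolve(b[t], STEP_PMF, mode="same")
    return b


def posterior(f, b):
    """Normalized product of the forward and backward distributions."""
    p = f * b
    return p / p.sum(axis=1, keepdims=True)


def simulate(n_traj=200_000, seed=0):
    """Sample random-walk trajectories; returns an (n_traj, T + 1) integer array."""
    rng = np.random.default_rng(seed)
    steps = rng.integers(-1, 2, size=(n_traj, T))  # uniform over {-1, 0, +1}
    start = np.zeros((n_traj, 1), dtype=steps.dtype)
    return np.concatenate([start, np.cumsum(steps, axis=1)], axis=1)


if __name__ == "__main__":
    post = posterior(forward_pass(), backward_pass())
    paths = simulate()
    keep = paths[:, -1] > 0  # trajectories consistent with the final observation
    t_slice = 5
    empirical = np.bincount(paths[keep, t_slice] + K, minlength=N_STATES) / keep.sum()
    print("max |analytical - sampled| at t = 5:", np.abs(empirical - post[t_slice]).max())
```

Filtering the sampled trajectories by their final sign plays the role of the backward constraint, so the empirical histogram at any time slice should approach the analytical posterior as the number of samples grows.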

Verification of accumulation model posterior solution
To illustrate the model posterior solution for the full accumulation model, we can again compare the analytical solution to sampled trajectories, this time for an example trial. Each trajectory has a unique noise realization. For both the model and the sampled trajectories, we have four distributions (Fig. 8): the forward distribution, showing the predicted trajectories given the initial conditions; the backward-delta distribution, showing the possible trajectories that result in a single final accumulation value; the posterior-delta distribution, showing the possible trajectories that start at the initial condition, as in the forward distribution, and result in a single final accumulation value, as in the backward-delta distribution; and the full posterior distribution, showing the possible accumulation values that start at the initial condition and result in the appropriate choice, that is, the appropriate sign of the accumulation value. In this example we consider a trial with a change in state at 750 ms, and we evaluate a left choice (a < B) at t = 1 s. The entire solution was computed using 51 backward-delta solutions on a grid from −50 to B with ∆a = 1. We can also examine slices through the posterior distribution at various time points to confirm agreement between the trajectories and the analytical solution (Fig. 9).
A few notes on the advantages of the analytical solution. First, it is considerably more accurate than previous numerical approaches. Second, it is much faster to fit and evaluate. Third, we can compute the posterior distribution for a subset of time points without computing the solution at all time points, which allows very rapid computation of the posterior distribution.

Model State Changes
We predicted the timing of changes of mind using the time points within a trial when the mean of the posterior distribution $p(a) = P(a \mid t, \delta_R, \delta_L, \theta, a_0 \sim \mathcal{N}(0, \sigma_i^2), y)$ crossed the decision boundary B. These trajectories have several sources of noise that complicated our analysis. First, they have sharp discontinuities at the time of each stimulus click. This sometimes resulted in the mean trajectory crossing the decision boundary repeatedly within a short period of time. It is unlikely that each of these crossings corresponded to a true change of mind, since the subject was likely in a general state of indecision (Fig. 11A). To resolve this issue, we smoothed the mean trajectories with a 100 ms running average, which removed the flickering model state changes from individual trials (Fig. 11B). However, there were still model state changes that only briefly crossed or oscillated around the decision boundary. To detect and remove these time points, we estimated the local slope of the smoothed trajectory and filtered out model state changes where the sign of the local slope was inconsistent with the direction of the state change (Fig. 11C). Finally, we excluded any state change that occurred in the first or last 200 ms of the trial.
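The detection steps described above can be sketched as follows. This is a hedged illustration, not the analysis code: the 100 ms smoothing window and 200 ms edge exclusion come from the text, while the function name `detect_state_changes`, the 50 ms half-width used to estimate the local slope (`slope_s`), the edge-value padding, and the default boundary of 0 are assumptions made for the example.

```python
import numpy as np


def detect_state_changes(mean_a, dt, boundary=0.0,
                         smooth_s=0.1, slope_s=0.05, edge_s=0.2):
    """Return a list of (time_in_s, direction) for putative changes of mind.

    mean_a   : posterior-mean trajectory sampled every dt seconds
    smooth_s : running-average window (100 ms in the text)
    slope_s  : half-width for the local slope estimate (assumed value)
    edge_s   : exclusion zone at the start and end of the trial (200 ms)
    direction is +1 for an upward crossing and -1 for a downward crossing.
    """
    mean_a = np.asarray(mean_a, dtype=float)
    n = len(mean_a)

    # Running average; edge values are repeated so the output keeps length n.
    window = max(1, int(round(smooth_s / dt)))
    left = window // 2
    padded = np.pad(mean_a, (left, window - 1 - left), mode="edge")
    smooth = np.convolve(padded, np.ones(window) / window, mode="valid")

    # Crossing between samples i and i + 1 when the side of the boundary flips.
    # Samples lying exactly on the boundary are not treated as crossings.
    side = np.sign(smooth - boundary)
    crossings = np.flatnonzero(side[:-1] * side[1:] < 0)

    half = max(1, int(round(slope_s / dt)))
    changes = []
    for i in crossings:
        t = (i + 1) * dt
        if t < edge_s or t > n * dt - edge_s:
            continue  # too close to the start or end of the trial
        lo, hi = max(0, i - half), min(n - 1, i + 1 + half)
        slope = (smooth[hi] - smooth[lo]) / ((hi - lo) * dt)
        direction = int(side[i + 1])
        if slope * direction > 0:  # keep only crossings the local trend supports
            changes.append((t, direction))
    return changes


if __name__ == "__main__":
    # Hypothetical trajectory: one clean switch from left to right at 0.5 s.
    t = np.arange(0.0, 1.0, 0.001)
    print(detect_state_changes(np.tanh(20.0 * (t - 0.5)), dt=0.001))
```

Estimating the slope over a window wider than a single sample is what lets the filter reject brief excursions: a trajectory that dips across the boundary and immediately returns has a local trend that disagrees with the direction of the crossing.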